In this paper we present a graph-theoretic polynomial algorithm which has positive probability of finding a Hamiltonian path 
in a given graph, if there is one; if the algorithm fails, it can be rerun with a randomly chosen starting solution, and there is 
again a positive probability it will find an answer. If there is no Hamiltonian path, the algorithm will always terminate with 
failure. We call this a Successful Algorithm because it has high (close to 1) empirical probability of success and it works in 
polynomial time. Some basic theoretical results concerning spanning arborescences of a graph are given. The concept of a 
ramification index is defined and it is shown that ramification index of a Hamiltonian path is zero. The algorithm starts with 
finding any spanning arborescence and by suitable pivots it endeavors to reduce the ramification index to zero. Probabilistic 
properties of the algorithm are discussed. Computational experience with graphs up to 30000 nodes is included. 


Hamiltonian path problem * ramification index * polynomial algorithm 


1. Introduction 


The Hamiltonian path problem is the problem 
of finding a path in a graph which passes through 
each node exactly once. This problem is well known 
and has been discussed in most graph theory books 
such as [1,3,5]. 

A polynomial algorithm will be presented in 
this paper which has positive probability of finding 
a Hamiltonian path in a given graph if there is 
one. If a graph has a Hamiltonian path, this 
algorithm will either find it or end with a message 
that it cannot proceed further. However, if there is 
no Hamiltonian path in the graph, the algorithm 
will always end with the 'cannot proceed further' 
message. 


The original title of this paper was "A polynomial algorithm 
which has positive probability of solving a Hamiltonian 
path problem". 

It was prepared as part of the activities of the Management 
Science Research Group, Carnegie-Mellon University, under 
contract No. N00014-82-K-0329 NR 047-048 with the U.S. 
Office of Naval Research. Reproduction in whole or part is 
permitted for any purpose of the U.S. Government. 

Despite the fact that the algorithm can fail, we 
have found it to be of great practical value due to 
the following reasons: 

- It takes only a few milliseconds to solve a 
problem of 100 nodes. Thus it is more efficient to 
run this algorithm repeatedly, with different random 
starting solutions, in the hope that an answer will 
eventually be obtained, than to use other 
large scale algorithms [2,4,7,8]. 

- The algorithm has solved problems having 
30000 nodes. Such a large problem takes less than 
one minute of DEC-20 CPU time. Our method is 
constrained by memory limitations and not by 
computation time. It would be possible to overcome 
the memory limitation by packing data or by 
making use of secondary storage devices. 


2. Exact statement of the problem 


Let G=(V, A) be a directed graph with V as 
the set of nodes and A as the set of arcs. G has n 
nodes and they are numbered sequentially from 1 
to n. The indegree of node 1 is zero. The outdegree 
of node n is also zero. The problem is to find a 
Hamiltonian path (HP) from node 1 to node n which 
goes through all the intermediate nodes exactly once. 


2.1. The Hamiltonian cycle problem 


The problem of finding a Hamiltonian cycle in 
a directed graph can be easily converted to a 
problem of finding a HP as shown in Figure 2.1. 
We split any node of the graph into two nodes so 
that one node has no incoming arc and the other 
has no outgoing arc. The Hamiltonian path be- 
tween these two nodes would be a Hamiltonian 
cycle in the original graph. In Figure 2.1 an HP 
between nodes 1 and 1a is equivalent to a Hamiltonian cycle (HC). 
Thus we are solving the general Hamiltonian path 
problem rather than that for a specific type of 
graph. 
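
The node-splitting construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `split_node`, the arc-list representation of the graph and the label "1a" for the copy are choices made here, not something specified in the paper.

```python
def split_node(arcs, v, v_copy):
    """Split node v (hypothetical helper): v keeps its outgoing arcs,
    and every arc that used to enter v now enters v_copy instead."""
    return [(i, v_copy) if j == v else (i, j) for (i, j) in arcs]


# A directed 4-cycle 1 -> 2 -> 3 -> 4 -> 1 plus a chord 2 -> 4.
cycle_graph = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
path_graph = split_node(cycle_graph, 1, "1a")

# Node 1 now has indegree zero and node "1a" has outdegree zero, so a
# Hamiltonian path from 1 to "1a" corresponds to a Hamiltonian cycle
# of the original graph.
print(path_graph)
```

After the split, the graph satisfies the conventions of Section 2 once "1a" is renumbered as the last node.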


2.2. The postman pickup problem 


This is a problem which can be translated into a 
Hamiltonian path problem. A postman has to pick 
up one letter from each of n locations and deliver 
them to the main post office. Starting from his 
home, how can he do this while covering the 
minimum possible distance? 

Let node 1 represent his home and node n 
represent the post office. Nodes 2 through n-1 
are the locations from which letters have to be 
picked up. The Hamiltonian path from node 1 to 
node n is a possible path on which the postman 
should travel in order to minimize his total dis- 
tance traveled. 


3. Preliminary concepts and definitions 


We give some preliminary definitions with in- 
terpretations related to the postman pickup prob- 
lem. 


Definition 1. An arborescence is a directed tree 
having one node called the root node which can be 
reached by a unique directed path from any node. 


For the postman pickup problem, the root node 
is the post office. 


Definition 2. Let G = (V, A) be a directed graph. 
Then arborescence $T = (V, A_T)$ is a spanning 
arborescence of G if $A_T$ is a subset of A. 


Definition 3. Let i and j be two nodes of a spanning 
arborescence T of G and $(i, j) \in A - A_T$. 

1. If there exists a directed path in T from i to j 
then (i, j) is said to be an up arc. 

2. If there exists a directed path in T from j to i 
then (i, j) is said to be a down arc. 

3. Otherwise (i, j) is called a cross arc. 


Definition 4. If $i, j \in V$ and $(i, j) \in A_T$, then the 
predecessor node of i is said to be j or symbolically 
$p(i) = j$. 

One can interpret predecessor of i to be like the 
‘father’ of i in the sense of a family tree. 


Definition 5. A node $i \in V$ is said to be a junction 
node of arborescence $T = (V, A_T)$ if it is the end 
node of at least two arcs belonging to $A_T$. 


Fig. 2.1. Converting a Hamiltonian cycle problem to a Hamiltonian path problem. 


Definition 6. A node $i \in V$ is said to be a beginning 
node if it is not the end node of any arc in $A_T$. 


Definition 7. For each $i \in V$ we define the successor 
set $\Psi_i = \{h : p(h) = i\}$. Thus $\Psi_i$ is the set of 
nodes which are immediately below i, i.e. the set of 
nodes for which i is the predecessor node. 


Lemma 8. If $\|\Psi_i\| = 0$ then i is a beginning node. 
Lemma 9. If $\|\Psi_i\| \geq 2$ then i is a junction node. 


Definition 10. The successor function of i, represented 
by s(i), for $i \in V$ is defined inductively as 
follows: 

$$s(i) = 1 + \sum_{h \in \Psi_i} s(h).$$


In other words, s(i)= 1+ number of nodes in 
T below i. For the postman pickup problem s(i) is 
the number of letters the postman will carry after 
leaving node i. 


Lemma 11. If i is a beginning node then s(i)= 1. 
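
Definitions 5 to 10 can be illustrated with a short Python sketch. The predecessor-map representation (`pred[i] = p(i)`) and the helper name `successor_data` are assumptions made for this example only; the paper does not prescribe a data structure.

```python
def successor_data(pred, root):
    """Return (succ_sets, s) for an arborescence given as a predecessor
    map pred[i] = p(i) for every non-root node i (hypothetical helper)."""
    succ_sets = {root: []}
    for i in pred:
        succ_sets.setdefault(i, [])
    for i, j in pred.items():
        succ_sets[j].append(i)          # i is immediately below j

    # Visit nodes from the root downwards, then accumulate bottom-up;
    # this avoids deep recursion on long paths.
    order = [root]
    k = 0
    while k < len(order):
        order.extend(succ_sets[order[k]])
        k += 1
    s = {}
    for v in reversed(order):
        s[v] = 1 + sum(s[h] for h in succ_sets[v])   # Definition 10
    return succ_sets, s


pred = {1: 2, 2: 4, 3: 4}               # arcs (1,2), (2,4), (3,4); root n = 4
succ_sets, s = successor_data(pred, root=4)
beginning = [i for i in succ_sets if len(succ_sets[i]) == 0]   # Lemma 8
junction = [i for i in succ_sets if len(succ_sets[i]) >= 2]    # Lemma 9
print(s, beginning, junction)
```

In this small tree, nodes 1 and 3 are beginning nodes with s = 1, as Lemma 11 states, and the root is the only junction node.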


Definition 12. The ramification index R of an 
arborescence $T = (V, A_T)$ is defined as follows: 

$$R = R(T) = \frac{n(n-1)}{2} - \sum_{i \in V - \{n\}} s(i).$$


Lemma 13. The maximum possible ramification 
index of an arborescence is $(n-1)(n-2)/2$. 


Proof. The arborescence for which the ramification 
index is maximum is the one for which the 
only junction node is the root node and each of 
the $n-1$ arcs in $A_T$ has the root node as its end 
node. For this case the value of the successor 
function at each node i for $i = 1, \ldots, n-1$ is 1 and 
we get from Definition 12: 

$$R = \frac{n(n-1)}{2} - (n-1) = \frac{(n-1)(n-2)}{2}.$$
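
As a numerical check of Definition 12 and Lemma 13, the following Python sketch computes R for the two extreme arborescences on six nodes: the star, where every arc ends at the root, and the path. The function name `ramification_index` and the predecessor-map input are assumptions of this sketch, and the ancestor walk is chosen for brevity, not speed.

```python
def ramification_index(pred, root):
    """R(T) = n(n-1)/2 - sum of s(i) over non-root nodes (Definition 12).
    pred[i] = p(i) for every non-root node i."""
    n = len(pred) + 1
    s = {v: 1 for v in pred}            # every node counts itself
    for i in pred:                      # add i to the count of each ancestor
        v = pred[i]
        while v != root:
            s[v] += 1
            v = pred[v]
    return n * (n - 1) // 2 - sum(s.values())


n = 6
star = {i: n for i in range(1, n)}       # every arc ends at the root
path = {i: i + 1 for i in range(1, n)}   # 1 -> 2 -> ... -> n
r_star = ramification_index(star, n)
r_path = ramification_index(path, n)
print(r_star, r_path)
```

The star attains the bound of Lemma 13, while the path has ramification index zero.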


Definition 14. Let J be the set of junction nodes of 
arborescence T. The ramification index of a junction 
node $j \in J$ is defined as: 

$$R_j = \sum_{\substack{i_1, i_2 \in \Psi_j \\ i_1 < i_2}} s(i_1)\, s(i_2).$$

The significance of Definition 14 is due to the 
following theorem. 


Theorem 15. The ramification index of an arborescence 
T can be computed from the following equation: 

$$R = R(T) = \sum_{j \in J} R_j.$$


Proof. Assume $T = (V, A_T)$ has n nodes. We will 
first prove the proposition for the case in which T 
has only one junction node j. The nodes above j do 
not contribute to the ramification index because, 
as will be seen in Theorem 16 (4 ⇒ 5), a tree with 
no junction nodes has zero ramification index. Let 
$s_1$ to $s_m$ be the successor functions of the nodes at the 
tails of the m arcs coming into j. Note that $j = n$ implies 
$s_1 + s_2 + \cdots + s_m = n - 1$. According to Definition 12, 

$$R = R(T) = \frac{n(n-1)}{2} - \sum_{i \in V - \{n\}} s(i)$$

or, since each branch below j is a simple path whose nodes have 
successor functions $1, 2, \ldots, s_l$, 

$$R = \sum_{k=1}^{n-1} k - \sum_{l=1}^{m} \sum_{k=1}^{s_l} k = s_1 s_2 + (s_1 + s_2)\, s_3 + \cdots + (s_1 + s_2 + \cdots + s_{m-1})\, s_m.$$

The last expression is equivalent to the expression 
in Definition 14. 

We have proved the proposition when there is 
just one junction node in T. By induction, assume 
that the proposition is true for up to $k - 1$ junction 
nodes in T. We will now show that the 
proposition is also true for k junction nodes in T. 

Let J be the set of k junction nodes of T with j 
designating the topmost junction node. We again 
assume that $j = n$. The proof is stated for the case 
in which $(i_1, j)$ and $(i_2, j)$ are the only two arcs 
coming into j. Let the successor functions of $i_1$ and $i_2$ 
be $s_1$ and $s_2$ respectively. Let $V_1$ denote all the 
nodes below and including $i_1$ and $V_2$ denote all the 
nodes below and including $i_2$. Furthermore, let $J_1$ 
and $J_2$ denote the sets of junction nodes in $V_1$ 
and $V_2$ respectively. Then by Definition 12 

$$R(T) = \frac{(s_1 + s_2 + 1)(s_1 + s_2)}{2} - \sum_{i \in V_1} s(i) - \sum_{i \in V_2} s(i).$$

Since $\|J_1\| \leq k - 1$ and $\|J_2\| \leq k - 1$, we can write 

$$\begin{aligned}
R(T) &= \frac{(s_1 + s_2)(s_1 + s_2 + 1)}{2} - \left[\frac{s_1(s_1 + 1)}{2} - \sum_{j \in J_1} R_j\right] - \left[\frac{s_2(s_2 + 1)}{2} - \sum_{j \in J_2} R_j\right] \\
&= s_1 s_2 + \sum_{j \in J_1} R_j + \sum_{j \in J_2} R_j \\
&= \sum_{j \in J} R_j.
\end{aligned}$$

This completes the proof. 


If there were more than two arcs entering j, the 
proof would simply require more summation terms. 
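
Theorem 15 can be checked numerically on a small example. The Python sketch below computes R once from Definition 12 and once as the sum of the junction-node indices of Definition 14. The function names and the example tree are assumptions introduced here for illustration.

```python
from itertools import combinations


def subtree_sizes(pred, root):
    """s(v) for every node, by adding each node to all of its ancestors."""
    s = {v: 1 for v in pred}
    s[root] = 1
    for i in pred:
        v = pred[i]
        while True:
            s[v] += 1
            if v == root:
                break
            v = pred[v]
    return s


def r_by_definition_12(pred, root):
    n = len(pred) + 1
    s = subtree_sizes(pred, root)
    return n * (n - 1) // 2 - sum(s[i] for i in pred)


def r_by_theorem_15(pred, root):
    s = subtree_sizes(pred, root)
    entering = {}                        # entering[j] = tails of arcs into j
    for i, j in pred.items():
        entering.setdefault(j, []).append(i)
    # Nodes with a single entering arc yield no pairs, so only junction
    # nodes contribute to the sum.
    return sum(s[a] * s[b]
               for tails in entering.values()
               for a, b in combinations(tails, 2))


# Junction nodes 3 (entered from 1 and 2) and 6 (entered from 3 and 5).
pred = {1: 3, 2: 3, 3: 6, 4: 5, 5: 6}
r12 = r_by_definition_12(pred, 6)
r15 = r_by_theorem_15(pred, 6)
print(r12, r15)
```

Here junction node 3 contributes 1 * 1 and junction node 6 contributes 3 * 2, which together give the value 7 obtained from Definition 12.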


Theorem 16. Let $T = (V, A_T)$ be a spanning 
arborescence of a directed graph G=(V, A). The 
following statements are equivalent: 

1. T is a Hamiltonian path starting from node 1 
and ending at node n. 

2. There is a directed path in T from node 1 to 
any other node $x \in V$. 

3. T has node 1 as its only beginning node. 

4. T has no junction nodes. 

5. T has zero ramification index. 


Proof. The proofs for the implications 1 ⇒ 2 ⇒ 3 
⇒ 4 are obvious. 

4 ⇒ 5: The assertion follows directly from Theorem 
15. Because $J = \emptyset$ there are no terms in the 
summation and thus the ramification index 
must be zero. 

5 ⇒ 1: Because the ramification index of T is 0, 
from Definition 12, we have 

$$\sum_{i \in V - \{n\}} s(i) = \frac{n(n-1)}{2}.$$

Since all s(i) are integers, the only way in which 
this is possible is if there is an ordering of the 
nodes with $s(i_1) = 1, s(i_2) = 2, \ldots, s(i_{n-1}) = n - 1$. 
Because the number of successors increases 
sequentially, it follows that T must be a Hamiltonian 
path. Because the indegree of node 1 is 0, 
$i_1 = 1$. Also because the outdegree of n is 0 we can 
choose $i_n = n$. 
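
Statement 4 of Theorem 16 gives a simple test that can be sketched in Python: if no node is the end node of two tree arcs, the arborescence is a path and its node order can be read off by following predecessors. The helper name `hamiltonian_order` and the predecessor-map input are assumptions of this sketch.

```python
def hamiltonian_order(pred, root):
    """Return the node sequence from the beginning node to the root if the
    arborescence has no junction node, otherwise None (hypothetical helper).
    pred[i] = p(i) for every non-root node i."""
    if not pred:
        return [root]                    # single-node tree
    indegree = {}
    for j in pred.values():
        indegree[j] = indegree.get(j, 0) + 1
    if any(d >= 2 for d in indegree.values()):
        return None                      # a junction node exists
    start = next(i for i in pred if i not in indegree)   # only beginning node
    order = [start]
    while order[-1] != root:
        order.append(pred[order[-1]])
    return order


print(hamiltonian_order({1: 2, 2: 3, 3: 4}, 4))
print(hamiltonian_order({1: 3, 2: 3, 3: 4}, 4))
```

The first tree is a path and yields the order 1, 2, 3, 4; the second has junction node 3 and is rejected.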


4. A polynomial algorithm which has positive proba- 
bility of finding a Hamiltonian path (if there is one) 


Let G=(V, A) be a directed graph with n nodes 
as described in Section 2. The algorithm to be 
described makes use of Theorem 16 which says 
that the ramification index of a Hamiltonian path 
is zero. We first find a spanning arborescence of G 
denoted by T by constructing a greedy starting 
solution. We call the exchange of one arc in the 
arborescence with one arc not in the arborescence 
one pivot. By performing a sequence of pivots 
we try to reduce the ramification index of T and 
when (and if) the ramification index of T becomes 
0, we have found a Hamiltonian path. 

The algorithm can get stuck at a given arbores- 
cence T with R(T)>0 if it is unable to find a 
pivot which will result in a decrease in the ramifi- 
cation index. This may mean: (a) that the graph 
has no Hamiltonian path, or (b) that the algorithm 
made an unfortunate choice of pivots which led it 
to get stuck. The alternative to stopping in this 
case is to choose a different starting solution and 
go through the algorithm again. It can be easily 
shown that there exists a starting solution which 
will lead to a Hamiltonian path, if there is one, 
after a finite number of pivots. 

For ease of exposition, we present the algorithm 
in three separate parts. 


4.1. Algorithm 1: Finding an initial spanning 
arborescence 


We want to find a spanning arborescence 
$T = (V, A_T)$ of a directed connected graph G = (V, A). 
For this purpose we define M which we call the set 
of marked nodes. This algorithm begins with 
$A_T = \emptyset$ and ends when set $A_T$ has been completely 
defined. 

Step 0: Set $M = \{n\}$, $A_T = \emptyset$, and $k = 0$. 

Step 1: Choose any arc $(i, j) \in A$ such that 
$i \notin M$ and $j \in M$. 

Step 2: Set $M = M \cup \{i\}$, $A_T = A_T \cup \{(i, j)\}$ and 
$k = k + 1$. 

Step 3: If $k = n - 1$ then STOP, else go to 
Step 1. 

Volume 3, Number 1 


The above is the greedy form of the algorithm 
for finding a spanning arborescence because it 
chooses as many as possible of those arcs to enter 
the solution which have their end points already 
marked to belong to T. A random greedy starting 
arborescence can be found by letting p be any 
number satisfying 0 < p < 1 and altering Step 2 as 
follows: 

Step 2*: Let x be a random number between 0 
and 1: if x < p go to Step 1, otherwise set 
$M = M \cup \{i\}$, $A_T = A_T \cup \{(i, j)\}$ and $k = k + 1$. 
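
A minimal Python sketch of Algorithm 1 follows. It is not the authors' FORTRAN code: the function name `initial_arborescence`, the arc-list input and the way a rejected arc leads to the next candidate are assumptions made here. With p = 0 it behaves as the greedy form; with 0 < p < 1 each candidate is rejected with probability p, in the spirit of Step 2*.

```python
import random


def initial_arborescence(n, arcs, p=0.0, rng=None):
    """Return pred with pred[i] = j for each tree arc (i, j); root is n."""
    rng = rng or random.Random()
    marked = {n}                                     # Step 0
    pred = {}
    while len(pred) < n - 1:                         # Step 3
        # Step 1: arcs leaving an unmarked node and entering a marked one.
        candidates = [(i, j) for (i, j) in arcs
                      if i not in marked and j in marked]
        if not candidates:
            raise ValueError("no spanning arborescence rooted at n exists")
        k = 0
        while rng.random() < p:                      # Step 2*: reject, retry
            k += 1
        i, j = candidates[k % len(candidates)]
        marked.add(i)                                # Step 2
        pred[i] = j
    return pred


arcs = [(1, 2), (2, 3), (3, 4), (2, 4), (1, 3)]
greedy = initial_arborescence(4, arcs)
randomized = initial_arborescence(4, arcs, p=0.5, rng=random.Random(0))
print(greedy, randomized)
```

The candidate list is rebuilt on every pass, which keeps the sketch short at the cost of speed; an implementation aimed at the graph sizes of Section 5 would maintain it incrementally.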


4.2. Algorithm 2: Calculating the change in 
ramification index 


Given a spanning arborescence, it is possible to 
compute its ramification index either by using 
Definition 12 or else by using the equations of 
Theorem 15. However we now present a method of 
calculating the change in ramification index which 
is much faster than those methods. 

Given an arborescence, a cross arc (i, j) is to be 
brought into the solution. Let there be $m_i$ arcs on 
the path from node i to node n and $m_j$ arcs on the 
path from node j to node n. Let the successor 
function of i be denoted by $\delta$. Noting that an 
increase in successor function means an equal 
decrease in the ramification index, consider what 
happens when the cross arc (i, j) is brought into 
the solution: 

1. The successor function of each of the nodes 
on the path from node i to node n (excluding node 
n) goes down by $\delta$ so that the corresponding 
change in ramification index is 

$$\Delta R_1 = m_i \, \delta. \qquad (1)$$

2. The successor function of each of the nodes 
on the path from node j to node n (excluding node 
n) goes up by $\delta$ so that the corresponding change 
here is 

$$\Delta R_2 = -m_j \, \delta. \qquad (2)$$

3. Finally, the extra arc (i, j) is added to the 
arborescence giving us 

$$\Delta R_3 = -\delta. \qquad (3)$$

The sum of equations 1 to 3 is the total change in 
ramification index if arc (i, j) is brought into the 
solution. We have now proved the following theorem. 


Theorem 17. The change in ramification index of an 
arborescence due to the introduction of a cross arc 
(i, j) into the solution is given by 

$$\Delta R = (m_i - m_j - 1)\, \delta$$

where 

$m_i$ is the number of arcs on the path from node i 
to node n, 

$m_j$ is the number of arcs on the path from node j 
to node n, and 

$\delta$ is the successor function of node i before arc (i, j) 
is brought into the solution. 


Now Algorithm 2 can be stated very simply: 

Step 1: Find a cross arc. 

Step 2: Using Theorem 17 find the change in 
ramification index if that arc were to be brought 
into the solution. 
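
The formula of Theorem 17 can be compared with a direct recomputation of the ramification index. In the Python sketch below, the helper names and the example tree are assumptions made for illustration, and the entering arc (5, 1) is simply assumed to be an arc of G.

```python
def depth_and_size(pred, root):
    """depth[v] = arcs on the tree path from v to the root; s[v] = s(v)."""
    depth = {root: 0}
    s = {v: 1 for v in pred}
    s[root] = 1
    for i in pred:
        ancestors = []
        v = i
        while v != root:
            v = pred[v]
            ancestors.append(v)
        depth[i] = len(ancestors)
        for a in ancestors:
            s[a] += 1
    return depth, s


def ramification_index(pred, root):
    n = len(pred) + 1
    _, s = depth_and_size(pred, root)
    return n * (n - 1) // 2 - sum(s[i] for i in pred)


def delta_r(pred, root, i, j):
    """Theorem 17: change in R when cross arc (i, j) replaces (i, p(i))."""
    depth, s = depth_and_size(pred, root)
    return (depth[i] - depth[j] - 1) * s[i]


pred = {1: 3, 2: 3, 3: 6, 4: 5, 5: 6}       # root n = 6, R = 7
predicted = delta_r(pred, 6, 5, 1)           # bring in cross arc (5, 1)
after = dict(pred)
after[5] = 1                                 # arc (5, 6) leaves the tree
actual = ramification_index(after, 6) - ramification_index(pred, 6)
print(predicted, actual)
```

For this arc $m_i = 1$, $m_j = 2$ and $\delta = 2$, so the predicted change is (1 - 2 - 1) * 2 = -4, which agrees with recomputing R before and after the exchange.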


4.3. Algorithm 3: Finding a Hamiltonian path 


Step 0: Find initial solution. Find a directed 
rooted spanning arborescence of G with n as the 
root node by using Algorithm 1. 

Step 1: Calculate the successor function and 
ramification index of starting arborescence 
$T = (V, A_T)$. Using Definition 10 calculate s(i) for all 
$i \in V$. Then from Definition 12 or from Theorem 
15 calculate R(T). If R(T) = 0: STOP - T is a 
Hamiltonian path. Else go to next step. 

Step 2: Find the incoming arc. For each cross 
arc $(i, j) \in A - A_T$ calculate the ramification index 
if that arc is brought into $A_T$ using Algorithm 2. 
Let $R_{new}$ be the minimum of these ramification 
indexes and let $(i_1, j_1)$ be the corresponding arc. 

Step 3: Check possible cases. If 

$R_{new} = 0$: STOP - A Hamiltonian path has 
been found. 

$R_{new} \geq R$: STOP - The algorithm failed to find 
a Hamiltonian path on this trial. If another trial is 
desired go back to Step 0 and generate a new, 
different spanning arborescence. 

$R_{new} < R$: Go to next step. 

Step 4: Perform pivoting. Bring the cross arc 
$(i_1, j_1)$ having ramification index $R_{new}$ into the 
solution by setting $A_T = A_T - \{(i_1, k) : (i_1, k) \in A_T\} \cup \{(i_1, j_1)\}$, 
that is, by dropping the tree arc that currently leaves $i_1$. 
Let $R = R_{new}$. Update the successor 
function. Go to Step 2. 
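
The steps of Algorithm 3 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the starting arborescence is passed in as a predecessor map, and the successor function and path lengths are recomputed from scratch after every pivot instead of being updated incrementally.

```python
def depth_and_size(pred, root):
    """depth[v] = arcs on the tree path from v to the root; s[v] = s(v)."""
    depth = {root: 0}
    s = {v: 1 for v in pred}
    s[root] = 1
    for i in pred:
        ancestors = []
        v = i
        while v != root:
            v = pred[v]
            ancestors.append(v)
        depth[i] = len(ancestors)
        for a in ancestors:
            s[a] += 1
    return depth, s


def is_ancestor(pred, root, a, b):
    """True if the tree path from b to the root passes through a."""
    v = b
    while v != root:
        v = pred[v]
        if v == a:
            return True
    return False


def find_hamiltonian_path(n, arcs, start):
    """Return the predecessor map of a Hamiltonian path, or None if stuck."""
    pred = dict(start)
    while True:
        depth, s = depth_and_size(pred, n)
        r = n * (n - 1) // 2 - sum(s[i] for i in pred)   # Definition 12
        if r == 0:
            return pred
        best = None
        for (i, j) in arcs:                              # Step 2
            if pred.get(i) == j:
                continue                                 # already a tree arc
            if is_ancestor(pred, n, j, i) or is_ancestor(pred, n, i, j):
                continue                                 # up or down arc
            change = (depth[i] - depth[j] - 1) * s[i]    # Theorem 17
            if best is None or change < best[0]:
                best = (change, i, j)
        if best is None or best[0] >= 0:                 # Step 3: stuck
            return None
        _, i, j = best
        pred[i] = j                                      # Step 4: pivot


arcs = [(1, 2), (2, 3), (3, 4), (2, 4), (1, 3)]
found = find_hamiltonian_path(4, arcs, {3: 4, 2: 4, 1: 3})
stuck = find_hamiltonian_path(4, [(1, 3), (2, 3), (3, 4)], {1: 3, 2: 3, 3: 4})
print(found, stuck)
```

In the first example two pivots, bringing in (2, 3) and then (1, 2), reduce R from 2 to 0. The second graph has two nodes of indegree zero, so no Hamiltonian path exists and the sketch reports failure.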


Theorem 18. Algorithm 3 is a polynomial algorithm. 


Proof. The maximum possible ramification index of an 
arborescence is of the order of $O(n^2)$ [Lemma 13]. 
We decrease the ramification index by at least 1 in each 
pivot, and each pivot step is linear in the number 
of arcs plus the number of nodes, so that the total work of 
Algorithm 3 is $O(n^2 (\|A\| + n))$, which is polynomial in n. 


4.4. Algorithm 4: Finding a maximal R spanning 
arborescence 


It is interesting to note that with minor modifi- 
cations Algorithm 3 can be used to find a span- 
ning arborescence with maximal ramification 
index. The modified algorithm is described now. 

Step 0: Find initial solution. Find a spanning 
arborescence of G using Algorithm 1. 

Step 1: Calculate successor function and ramification 
index of the starting arborescence 
$T = (V, A_T)$. Using Definition 10 calculate s(i) for all 
$i \in V$. Then from Definition 12 or from Theorem 
15 calculate R(T). 

Step 2: Find incoming arc. For each cross arc 
$(i, j) \in A - A_T$ calculate the ramification index if 
that arc is brought into $A_T$ using Algorithm 2. Let 
$R_{new}$ be the maximum of these ramification indexes 
and let $(i_1, j_1)$ be the corresponding arc. 

Step 3: Check possible cases. If 

$R_{new} \leq R$: STOP - A maximal R spanning 
arborescence has been found. 

$R_{new} > R$: Go to next step. 

Step 4: Perform pivoting. Bring the cross arc 
$(i_1, j_1)$ having ramification index $R_{new}$ into the 
solution by setting $A_T = A_T - \{(i_1, k) : (i_1, k) \in A_T\} \cup \{(i_1, j_1)\}$. 
Let $R = R_{new}$. Update the successor 
function. Go to Step 2. 

There are applications in which spanning 
arborescences which have maximal ramification 
index can be useful. One such application is the 
design of computer terminal networks in which 
signal concentrators are located at junction nodes 
[6]. 

It is obvious that a random choice of incoming 
arcs can be made in Step 2 of each of the last two 
algorithms by choosing as the incoming arc any cross arc 
which decreases the ramification index in Algorithm 
3 (or increases the ramification index in 
Algorithm 4). 


5. Computational experience 


We ran Algorithm 3 on Directed Rectangular 
Lattice Graphs (DRLG) because they are easy to 
generate and it is possible to predict whether there 
is or is not a Hamiltonian path in a given DRLG. 
DRLGs are planar graphs which look like a grid 
as shown in Figure 5.1. They always have a Hamiltonian 
path when the number of nodes is even before 
node 1 is split. See [9] for details. 


Fig. 5.1. Example of a 4×4 directed rectangular lattice graph. 
A Hamiltonian path is indicated by the darkened edges. 

The computational experience is shown in Table 
5.1. 

We now make some observations about the 
computational results. 

- The algorithm never failed on any DRLG 
which had a Hamiltonian path. 

- The algorithm was coded in FORTRAN and all 
runs were performed on the DECSYSTEM-20 at 
Carnegie-Mellon University. 

- The starting value of ramification index is 
about 98% of the maximum possible value of 
ramification index [Lemma 13] for the larger problems 
in Table 5.1. A large value of R 
indicates a bushy tree which is a good start since it 
has less chance of getting stuck. 

- According to Theorem 18, the upper bound 
on the number of pivots is the maximum possible 
ramification index. However, we can see from 
Tables 5.1 and 5.2 that the number of pivots 
required to obtain the solution is much less than 
the starting value of the ramification index. 

- When a random starting solution is chosen, 
Algorithm 3 takes a little more than twice the computing 
time compared to when a greedy starting 
solution is used. The number of pivots required is 
also about double. Why this is to be expected is explained 
next. 


Table 5.1. 
Computational experience for Algorithm 3 with a greedy start 

Grid   # Nodes   # Arcs   Starting R   # Pivots   CPU Time (sec)
  10       100      180         4080         24            0.062
  20       400      760        72360         99            0.142
  30       900     1740       378840        224            0.292
  40      1600     3120      1217520        399            0.631
  50      2500     4900      3002400        624            1.138
  60      3600     7080      6267480        899            1.868
  70      4900     9660     11666760       1224            2.792
  80      6400    12640     19974240       1599            4.113
  90      8100    16020     32083920       2024            5.688
 100     10000    19800     49009800       2499            8.118
 110     12100    23980     71885880       3024           10.365
 120     14400    28560    101966160       3599           14.232
 130     16900    33540    140624640       4224           17.531
 140     19600    38920    189355320       4899           21.587
 150     22500    44700    249772200       5624           26.204
 160     25600    50800    323609280       6399           39.976
 170     28900    57460    412720560       7224           43.856
 174     30276    60204    453079992       7568           38.434


- A greedy starting solution for a DRLG hap- 
pens to be a good starting solution having a struc- 
ture reasonably close to a Hamiltonian path. As 
can be seen from Table 5.1, it is even possible to 
predict the number of pivots required by a subse- 
quent problem using the formula 


Change in number of pivots 
= One-fourth the change in number of nodes. 


For a random start, there is no regular pattern. 

- Number of CPU seconds required per pivot 
per node is a constant indicating empirically that 
each pivot requires a linear amount of work. 

- The CPU times given here cannot be compared 
with other computational results. Reference 
[3] has computational experience for problems 
having up to only 60 nodes whereas [6] has no 
computational experience. 
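
The pivot-count rule noted above for the greedy start can be checked directly against Table 5.1. The short Python sketch below uses the first five rows of that table; the variable names are illustrative only.

```python
# (nodes, pivots) pairs copied from the first five rows of Table 5.1.
rows = [(100, 24), (400, 99), (900, 224), (1600, 399), (2500, 624)]

for (n0, p0), (n1, p1) in zip(rows, rows[1:]):
    # Change in pivots should equal one-fourth the change in nodes.
    assert 4 * (p1 - p0) == n1 - n0

# Predict the pivot count of the next problem (3600 nodes) from the last row.
predicted_next = rows[-1][1] + (3600 - rows[-1][0]) // 4
print(predicted_next)
```

The prediction for 3600 nodes is 624 + 1100/4 = 899, which matches the entry for grid size 60 in Table 5.1.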


Table 5.2 
Computational results for random starting solution 


# Nodes # Arcs Starting R # Pivots CPU Time (sec) 


100 180 4064 46 0.118 
400 760 72008 177 0.442 
900 1740 337488 481 1.773 
1600 3120 1213976 852 7.474 
2500 4900 2994828 1323 11.965 
3600 7080 6251360 2151 22.966 
4900 9660 11654952 2889 35.652 
6400 12640 19949896 3994 86.459 


6. Conclusions 


- We have devised a way of improving the 
performance of Algorithm 2 which is expected to 
decrease the CPU times noted in Tables 5.1 and 
5.2. 

- We will present a proof elsewhere that Algorithm 
3 will never fail when applied to a DRLG 
with an even number of nodes (before any transformations 
are performed). A DRLG with an odd 
number of nodes does not have any HP so that the 
algorithm will always fail. 

- Algorithm 3 as stated is very inefficient for 
undirected graphs. We have developed a modification 
of Algorithm 3 which has proved to be very 
effective for finding a Hamiltonian path on an 
undirected graph. The results will be reported in a 
later paper. 